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11338 (NCR.0120US) 

RESCHEDULING OF MODIFICATION OPERATIONS FOR 
LOADING DATA INTO A DATABASE SYSTEM 

BACKGROUND 

[0001] A database is a collection of stored data that is logically related and that is 
accessible by one or more users. A popular type of database is the relational database 
management system (RDBMS), which includes relational tables made up of rows and 
columns. Each row represents an occurrence of an entity defined by a table, with an 
entity being a person, place, or thing about which the table contains information. To 
extract data from, or to update, a relational table, queries according to a standard database 
query language (e.g., Structured Query Language or SQL) are submitted to the database 
system. A table (also referred to as a relation) is made up of multiple rows (also referred 
to as tuples). Each row (or tuple) includes multiple columns (or attributes). 

[0002] A popular application of database management system is data warehousing. A 
data warehouse is basically a collection of data received from various sources. One 
example type of data warehouse is in the retail context, where information regarding 
shopping habits, histories, and other information of customers are collected and stored in 
the data warehouse. Traditionally, new data is loaded into a data warehouse in batches at 
regular time intervals (e.g., every night). As a result, enterprises often make decisions 
based on stale data. 

[0003] To address this issue, operational data warehouses are used. In an operational 
data warehouse, new data is loaded into the data warehouse in real time (or close to real 
time). Typically, a load utility is used to continually load data into the data warehouse. 
Data loading is accomplished by use of update transactions submitted to the database 
management system in which the data warehouse is implemented. 

[0004] An issue associated with database system operation is the occurrence of deadlock 
among different transactions. The occurrence of deadlock is even more likely when there 
are multiple transactions modifying base relations of a materialized view. A materialized 
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view stores a pre-computed query result that can be used for satisfying certain subsequent 
queries. Computing the result from a materialized view is generally more 
computationally efficient than computing the result from base relation(s). 

[0005] One type of a materialized view is a join view, which stores results of a join of 
multiple base relations. In a procedure referred to as materialized view maintenance, a 
materialized view is updated when underlying base relations are modified. As base 
relations are changed through insertion of rows, deletion of rows, or updating of rows, the 
corresponding rows of the materialized view are also modified. 

[0006] To maintain data consistency, each transaction in a database system typically 
places some type of lock on relations and materialized views that are involved in the 
transaction. In some scenarios, the conflicting locks for multiple transactions are placed 
on the relations and materialized views in such an order that none of the multiple 
transactions can proceed further — a deadlock condition. A deadlock among transactions 
reduces the ability of transactions to successfully complete in a database system. 

[0007] In an operational data warehouse, a load utility is continually loading data into a 
database system, which causes generation of a relatively large number of transactions to 
perform the data loading. The occurrence of deadlock conditions during the loading of 
data in an operational data warehouse will cause data warehouse performance to suffer. 

SUMMARY 

[0008] In general, methods and apparatus are provided to reschedule (partition and/or re- 
order) transactions associated with loading data into a database system to reduce 
likelihood of deadlocks. For example, a method for use with a database system that 
stores a join view associated with plural base relations includes receiving modification 
operations that modify at least two of the base relations of the join view. The received 
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modification operations are re-ordered to avoid execution of modification operations of 
more than one of the at least two base relations at one time in the database system. 

[0009] Other or alternative features will become apparent from the following description, 
from the drawings, and from the claims. 

BRIEF DESCRIPTION OF THE DRAWING 
[0010] Fig. 1 is a block diagram of an arrangement that includes an example database 
system and servers having respective load utilities. 

[001 1] Fig. 2 is a flow diagram of a rescheduling process, according to one embodiment. 

[0012] Fig. 3 illustrates the re-ordering of modification operations in the rescheduling 
process of Fig. 2. 

[0013] Fig. 4 is a conflict graph to illustrate performance of data loading by multiple load 
utilities. 

[0014] Fig. 5 is a flow diagram of a process of applying locks to relations in a database 
system. 

DETAILED DESCRIPTION 
[0015] In the following description, numerous details are set forth to provide an 
understanding of the present invention. However, it will be understood by those skilled 
in the art that the present invention may be practiced without these details and that 
numerous variations or modifications from the described embodiments are possible. 

[0016] Fig. 1 shows an example arrangement that includes a database system 10, which 
can be a relational database management system (RDBMS). The database system 10 is a 
parallel database system having a plurality of data server modules 12, each responsible 
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for managing access to or modification of data stored in respective storage modules 14. 
Examples of the responsibilities of each data server module (also referred to as "an access 
module") include locking databases, tables, or portions of tables; creating, modifying, or 
deleting definitions of tables; inserting, deleting, or modifying rows within tables; 
retrieving information from definitions and tables; and so forth. The data server modules 
12, after executing an action, also return responses to the requesting client. In one 
example implementation, the data server modules 12 are based on access module 
processors (AMPs) in TERADATA® database systems from NCR Corporation. 

[0017] The database system 10 is coupled to a server 28, which in turn is coupled over a 
data network 16 to one or plural client systems 18. The client systems 18 are capable of 
issuing queries over the data network 16 to the database system 10. The server 28 
includes a load utility 20, which is responsible for loading data received from various 
sources (such as client systems 18) into the database system 10. The arrangement of Fig. 
1 also includes an additional server 29, and a load utility 21 running on the server 29. 

[0018] In accordance with some embodiments of the invention, the load utility 20 is also 
responsible for rescheduling modification operations (e.g., operations specified by 
Structured Query Language (SQL) statements) received from the client systems 18 prior 
to submission of modification operations to the database system 10. Although reference 
is made to SQL in the described embodiments, it is contemplated that other embodiments 
can employ statements according to other database query languages. 

[0019] The rescheduling of modification operations performed in accordance with some 
embodiments involves the re-ordering of the loading of data into the database system 10 
such that, at any time, for any materialized join view, data is loaded into only one of the 
base relations making up the join view. To avoid deadlock, the load utility 20 seeks to 
avoid updating more than one base relation of a join view at the same time. 
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[0020] The re-ordering of modification operations is used in conjunction with a 
partitioning technique, described further below, to separate "conflicting" modification 
operations so that such conflicting modification operations do not execute concurrently. 

[0021] In addition to the re-ordering and partitioning of modification operations, pre- 
aggregation is also performed to reduce the number of SQL statements that are submitted 
in load transactions. By pre-aggregating SQL statements, more SQL statements can be 
executed in each given transaction so that the number of transactions can be reduced. 
Each transaction is associated with transaction overhead. Pre-aggregation reduces the 
amount of overhead associated with transactions. 

[0022] According to one implementation, the load utility 20 performs "continuous" 
loading of data into the database system. The load utility 20 receives modification 
operations (insert, delete, or update) from several data sources, represented as client 
systems 18 in Fig. 1. The term "continuous loading" as used in the continuous loading 
context does not necessarily refer to continuous loading that is uninterrupted in time. 
Rather, the term "continuous loading" refers to the fact that data is being loaded into the 
database system 10 more or less in real time (that is, as the data become available and as 
the data sources make such data available for loading into the database system 10). As 
discussed earlier, one application of continuous loading is in operational data 
warehousing systems, where data from the data sources are loaded into a central data 
warehouse (stored on the database system 10) in real time (or substantially in real time). 
Another application of continuous data loading is a data stream application, in which 
streams of data are provided from multiple sources to the database system. 

[0023] The query originated by a client system 18 and forwarded by the load utility 20 is 
received by one or plural parsing engines 22 in the database system 10. Each parsing 
engine includes a parser 24 and a scheduler 26. The parser 24 checks a received request 
for proper syntax and semantically evaluates the request. The parser 24 also includes an 
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optimizer that develops an execution plan for received requests. The execution plan 
includes a sequence of executable steps that are communicated by the scheduler 26 to one 
or more of the data server modules 12 for execution. The parsing engine(s) 22 and data 
server modules 12 of the database system 10 are part of a database engine, which can be 
implemented in software, hardware, or a combination of both. 

[0024] The load utility 20 is considered to be separate from the database engine of the 
database system 10, even if the load utility 20 is running in the database system 10. The 
load utility 20 differs from the database engine in that the load utility does not access data 
objects or other data structures (e.g., tables, materialized views, etc.) stored in the 
database system 10. The load utility 20 performs designated pre-processing tasks with 
respect to modification operations and/or queries submitted by client systems. The 
modification operations and/or queries, after pre-processing, are submitted by the load 
utility 20 to the database engine to perform the actual retrieval or manipulation of 
database objects, such as table rows, columns, tables, views, and so forth. 

[0025] To access the database system, each client system 18 establishes one or more 
sessions with the server 28 and/or database system 10. A "session" refers to activity by a 
user or application over some period of time. By opening multiple sessions to the 
database system 10, concurrency is enhanced. Each session can be maintained for a 
relatively long time so that the session does not need to be re-established for each use. 

[0026] The load utility 20 is able to perform partitioning of modification operations such 
that modification operations operating on the same tuples are sent through the same 
session. This partitioning reduces the occurrence of deadlocks due to modification 
operations in multiple sessions performing modification of the same tuples. 
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[0027] For the purpose of efficiency, within a transaction, SQL statements can be pre- 
compiled into a stored procedure that is submitted by the load utility 20 to the database 
system 10. 

[0028] Fig. 2 illustrates a process according to some embodiments for improving 
database performance in processing SQL statements that are submitted by client systems 
18 to the load utility 20. In some embodiments, the process depicted in Fig. 2 is 
performed by the load utility 20 executed in the server 28. In other embodiments, the 
process of Fig. 2 can be performed by a load utility in the database system, by the parsing 
engine 22, or by some other software or hardware module. 

[0029] The load utility 20 receives (at 102) queries, such as in the form of SQL 
statements, from the client systems 18. For improved performance, and to reduce 
likelihood of deadlocks, the load utility 20 partitions (at 104) the modification operations 
specified by the received queries such that modification operations on the same tuple are 
submitted through the same session. As noted above, the load utility 20 can open 
multiple sessions to the database system 10 to perform transactions. Suppose the load 
utility 20 opens k > 2 sessions Si (1 < i < k) to the database system. If modification 
operations are randomly distributed among the k sessions, transactions from different 
sessions can easily deadlock on their X lock requests on the base relations. An X lock is 
an exclusive lock placed on a table or portion of a table (such as a tuple) when one 
transaction is updating the table or table portions, to prevent access of the table or table 
portion by another transaction. 

[0030] The following example invokes a single base relation R and the following four 
operations (a tuple refers to a row of a table): 
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Oi: Update tuple ti in base relation R. 

O2: Update tuple t2 in base relation R. 

O3: Update tuple t2 in base relation R. 

O4: Update tuple ti in base relation R. 

[0031] These operations require the following tuple-level locks on base relation R: 
Oi : A tuple-level X lock on R for tuple ti. 
O2: A tuple-level X lock on R for tuple t2. 
O3: A tuple-level X lock on R for tuple t2. 
O4: A tuple-level X lock on R for tuple ti . 

[0032] Suppose operations Oi and O2 are part of transaction Ti that is sent through 
session Si. Operations O3 and O4 are part of transaction T2 that is sent through session 
S2. If transactions Ti and T 2 are executed in the order 

Ti executes Oi, 

T2 executes O3, 

Ti executes O2, 

T2 executes O4, 

then a deadlock will occur. This is because both operations Ot and O4 require a tuple- 
level X lock on R for tuple ti. Also, both operations O2 and O3 require a tuple-level X 
lock on R for tuple t 2 . 

[0033] A simple solution to the above deadlock problem is to partition (e.g., hash) the 
tuples among different sessions so that modification operations on the same tuple are sent 
through the same session. In this way, the deadlock condition (transactions from different 
sessions modifying the same tuple) can be avoided. Effectively, the transactions that 
operate on the same set of one or more tuples are identified and re-allocated (partitioned) 
to the same session. 
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[0034] After partitioning, the modification operations are separated into multiple 
sessions. Within each session, n modification operations on base relation Ri, where Rj is 
one of the base relations of a materialized join view, are grouped (at 106) into a 
transaction. In this discussion, it is assumed that the database system 10 maintains a join 
view JV, defined by 7i(a(Ri MR 2 tx . . . MR h )), or an aggregate join view, defined by 
y(7i(a(Ri cxR 2 M . . . cxR h ))), where y is an aggregate operator. The grouping of n (where n 
is some defined value) modification operations into a single transaction helps reduce the 
number of transactions submitted in each session, and thus reduces transaction overhead. 

[0035] Note that each transaction contains modification operation on the same base 
relation Rj. The transactions on respective base relations (or a materialized join view) are 
then scheduled (at 108) for execution in the database system 10 such that multiple 
transactions on different base relations of the same materialized join view are not 
concurrently executed. The grouping and scheduling performed at 106 and 108 are part 
of the re-ordering technique according to some embodiments that reduces likelihood of 
deadlock. The re-ordering technique is discussed in greater detail below. 

[0036] To further enhance concurrency, a Y-lock locking mechanism is applied (at 1 10) 
by the database system 10 to assert Y locks on materialized views during transactions that 
update the materialized views. A Y lock has similar functionality as an X lock, with the 
major difference being that the Y lock does not conflict with itself. In other words, the Y 
lock is a modified type of exclusive lock placed on a join view that allows subsequent Y 
locks to be placed on the same join view even though a first Y lock is still active on the 
join view. In contrast, a table-level X lock placed on a join view blocks a subsequent X 
lock on the same join view. By enabling multiple Y locks to be concurrently placed on 
the same join view, throughput of transactions in the database system 10 can be greatly 
increased, as the possibility of lock conflicts on the join views among different 
transactions is reduced. A Y lock is also referred to as a "modified-exclusive" lock. 
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[0037] Normally, in continuous data loading, data is allowed to be loaded into multiple 
base relations concurrently to keep the data in the database system 10 as up-to-date as 
possible. However, if a join view is defined on multiple base relations, a deadlock can 
occur, since a join view JV links different base relations together. When a base relation 
of JV is updated, to maintain JV, all the other base relations of JV are read. These reads 
can conflict with other concurrent writes to the other base relations of JV. For example, 
consider the following two base relations: A(a, c) and B(d, e). Suppose a join view 
JV=AIXB is defined on A and B, where the join condition is A.c=B.d. Consider the 
following two modification operations: 

(1) Oi : Modify a tuple t\ in base relation A whose c=v. 

(2) 0 2 : Modify a tuple t 2 in base relation B whose d=v. 

These modification operations require the following tuple-level locks on base relations A 
andB: 

Oi : (Li i) A tuple-level X lock on A for tuple U . 

(Li 2 ) Several tuple-level S locks on B for all the tuples in B whose d=v 

(for join view maintenance purpose). 
O2: (L21) A tuple-level X lock on B for tuple t 2 . 

(L22) Several tuple-level S locks on A for all the tuples in A whose c=v. 

[0038] Suppose operation Oi is executed by transaction Ti through session Si, and 
operation O2 is executed by transaction T 2 through session S 2 . If transactions Ti and T 2 
request the locks in the following order: 

(1) step 1: Ti requests Ln, 

(2) step 2: T2 requests L 2 i, 

(3) step 3: Ti requests L12, 

(4) step 4: T 2 requests L 22 , 
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a deadlock occurs. This is because Ln (L22) contains a tuple-level X (S) lock on A for 
tuple ti. Also, L 2 i (Li 2 ) contains a tuple-level X (S) lock on B for tuple t 2 . An S lock is 
placed when reading a tuple to lock subsequent transactions from writing to the tuple. 

[0039] Thus, deadlock occurs if data is allowed to be loaded into multiple base relations 
of the same join view concurrently. However, if concurrent loading of data into multiple 
base relations of a join view is not allowed, then deadlock can be avoided if the following 
rules are satisfied: 

Rule 1 : At any time, for any join view JV, data can only be loaded into one base 
relation of JV (re-ordering technique corresponding to acts 106 and 108 in Fig. 2). 

Rule 2: Modification operations on the same base relation use the partitioning 
method (act 104 in Fig. 2) discussed above. 

Rule 3: The system uses a high concurrency locking protocol (e.g., the Y-lock 
locking mechanism mentioned above for placing Y locks) on join views so that 
lock conflicts on the join views can be reduced (act 1 10 in Fig. 2). 

[0040] By adhering to rules 1 and 2 above, deadlocks resulting from lock conflicts on the 
base relations are avoided. By following rule 3, deadlocks resulting from lock conflicts 
on the join views can be ignored, since Y locks according to the Y-lock locking 
mechanism are compatible with themselves. 

[0041] To apply rule 1, the re-ordering technique is used to reorder modification 
operations. As shown by Fig. 3, consider a database with d base relations Ri, R2, . . and 
R<j and e join views JVi, JV 2 , . . ., and JV e . An array J that contains d elements Ji (1 <i < 
d) is maintained for each corresponding base relation Ri, R2, . . and Ra. For each i (1 <i 
<d), Ji records the number of currently executing transactions that modify base relation 
Ri. Each Ji (1 <i ^d) is initialized to zero. Another data structure that is maintained is a 
queue Q, one for each session. There are k sessions active in the database system. For 
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each m (1 <m <k), a queue Q m is maintained for recording transactions waiting to be run 
through session S m . Each Q m (1 ^m ^k) is initialized to empty. Note that Jj maintains a 
count of the number of transactions on base relation R\ over all sessions S m (1 <m <k). 

[0042] If base relations R\ and Rj (1 <i, j <d, i are base relations of the same join 
view, then base relations R { and Rj are said to conflict with each other. Two transactions 
modifying conflicting base relations are said to conflict with each other. A transaction T 
is called a desirable transaction if it does not conflict with any currently running 
transaction. Consider a particular base relation Ri (1 <i <d). Suppose r , r^, ...,and 

r s (w >0) are all the other base relations that conflict with base relation Rj. At any time, 

if either w=0 (there are no other base relations that conflict with Rj) or all the = o (1 ^ 

u <w) (there are no running conflicting transactions), then a transaction T modifying base 
relation R* (1 <i <d) is indicated as being a desirable transaction. 

[0043] Transactions are scheduled in the following way. For each session S m (1 <m < 
k), whenever the continuous load utility has collected n modification operations on a base 
relation R\ (1 <i ^d), these operations are combined (or grouped, act 106 in Fig. 2) into a 
single transaction T, which is inserted into the end of queue Q m . In Fig. 3, a transaction 
Ti that operates on relation R } is stored in queue Q 2 , a transaction T 2 that operates on 
relation R\ is stored in queue Qk, a transaction T3 that operates on relation R 2 is stored in 
queue Qk, and a transaction T4 that operates on relation R3 is stored in queue Qi. 
Transaction Ti contains modification operations on a first tuple of R[, while transaction 
T 2 contains modification operations on a second tuple of R\. The partitioning of Ti and 
T 2 is performed by the load utility 20, as discussed above. Because the transactions Ti 
and T 2 modify different tuples of R is the transactions Ti and T 2 can be scheduled to 
execute concurrently on the database system 10 without causing deadlock. 
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[0044] If session S m is free (that is, a transaction is not currently executing in session S m ), 
a transaction is scheduled to the database system 10 for execution through 'session S m , 
provided that any other executing transaction in any other session S p (1 <p ^k, p tti) 
does not conflict with the transaction in session S m . 

[0045] When some transaction T modifying base relation Ri (1 <i <d) finishes execution 
and frees session S m (1 <m <k), the following operations are performed: 

(a) Ji (the count representing the number of transactions modifying relation Ri) is 
decremented by one. Note that J represents a count of the number of actively running 
transactions, and not the number of pending transactions in a queue Q. 

(b) If Q m is not empty, another transaction is scheduled to the database system 10 for 
execution through session S m . 

(c) If Ji is decremented to zero (the number of executing transactions modifying base 
relation Ri is zero), then previously conflicting transactions that are pending can now 
be scheduled for execution. For each g (1 <g ^c, g ;*m), if session S g is free and Q g 
is not empty, a transaction is scheduled to the database system for execution through 
session S g . 

[0046] Whenever it is desirable to schedule a transaction to the database system 10 for 
execution through session S m (1 ^k), the following operations are performed. The 
queue Q m is searched sequentially until either a desirable transaction T is found or all the 
transactions in Q m have been scanned, whichever comes first. In the case that a desirable 
transaction T modifying base relation Ri (1 <i <d) is found, Ji is incremented by one and 
transaction T is sent to the database system 10 for execution. 

[0047] In the above discussion, starvation of transactions is not considered. If starvation 
needs to be considered, then a starvation prevention technique is defined that is integrated 
into the transaction re-ordering method. For example, a special header transaction can be 
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used to prevent the first transaction in any queue Q g (1 <g ^k) from starving. A pointer r 
whose value is between 0 and k is maintained. The pointer r is initialized to 0. If every 
Q m (1 <m <k) is empty, then set r = 0. At any time, if r = 0 and a transaction is inserted 
into some Q m (l <m <k), then set r = m. If r = m (1 <m ^k) and the first transaction of 
Q m leaves Q m for execution, r is incremented by one (if m = k, then set r = 1). If Q r is 
empty, r is incremented until either Q r is not empty or it is discovered that every Q m (1 < 
m <k) is empty. In the latter case, set r = 0. 

[0048] A pre-defined time period TS is specified by a user based on application 
requirement. If pointer r has stayed at some value v (1 <v <k) longer than TS, the first 
transaction of Q v becomes the header transaction. The header transaction is identified as 
a transaction that has stayed too long (longer than TS) in a queue Q. When a header 
transaction is present, the load utility 20 modifies its scheduling algorithm to give 
preference to the header transaction so that the header transaction can be scheduled for 
execution sooner. One technique to do this is to avoid scheduling other transactions that 
conflict with the header transaction for execution. Whenever the load utility 20 is 
searching for a desirable transaction in some Q m (1 <m <k), and the load utility 20 finds 
transaction T, if the header transaction exists, the load utility 20 ensures that either T is 
the header transaction or T does not conflict with the header transaction. Otherwise 
transaction T is indicated as not desirable and the load utility 20 will continue the search. 

[0049] In the discussion above, it is assumed that only one multi-session continuous load 
utility is running on a platform (such as the server 28). Suppose the server 28 becomes a 
bottleneck. To reduce competition for resources, it may be desirable to run multiple 
continuous load utilities, each on a different platform (shown as additional servers 29 in 
Fig. 1 with respective load utilities 21). In this case, it is desired to prevent deadlocks 
among transactions submitted by different continuous load utilities. To achieve this, the 
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following conflict graph partitioning method can be used to extend the re-ordering 
method. 



[0050] Consider a database with d base relations Ri, R 2 , . . and Rd and e join views JV b 
JV 2 , . . ., and JV e . The conflict graph of this database is constructed with d nodes. Node i 
represents base relation Ri (1 <i <d). For each join view JVj (1 <j <e), suppose that it is 
defined on tj (2 <t ^d) of the d base relations. For each two of the tj base relations, an 
edge is drawn between the two nodes representing these two base relations. If there are f 
(f > i) edges between two nodes, only one edge is kept between them, where the weight of 
the edge is f. 

[0051] For example, suppose that a database contains base relations A, B, C, D, E, and F, 
and join views JVi=Ao«BmC, JV 2 =CmD, and JV 3 =EmF. In this example, the conflict 
graph is shown in Fig. 4. 

[0052] The conflict graph is partitioned into connected components (component 150 and 
component 152 illustrated in Fig. 4). All base relations in the same connected component 
are processed by the same continuous load utility. Different connected components may 
share the same continuous load utility. This can be achieved by letting the continuous 
load utility provide instructions or suggestions to the database administrator when the 
database administrator sets up the continuous load utilities in how components of a 
conflict graph are to be distributed across plural load utilities. In this way, deadlocks 
among transactions submitted by different continuous load utilities can be avoided, since 
no join view is defined on the base relations in different connected components. 

[0053] The above conflict graph partitioning method may not always be the optimal 
approach. For example, consider the example shown in Fig. 4. According to the conflict 
graph partitioning method, A and B should be dealt with by the same continuous load 
utility. Assume that tuples of A and tuples of B arrive at the same rate. Also, assume that 
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a single platform (such as server 28) has only enough resources to handle either A or B, 
but not both of them. There are two possible cases: (1) in the case of high deadlock 
probability, a single continuous load utility is used to handle both A and B, even if this 
means that the single platform becomes a bottleneck; and (2) in the case of low deadlock 
probability (e.g., if the arriving tuples of A and the arriving tuples of B have different join 
attribute values), two continuous load utilities may be used, each on a different platform. 
One load utility handles A, while another load utility handles B. Although this technique 
does not conform with the conflict graph partitioning method, this technique leads to 
improved throughput, as the benefits of using two platforms will substantially outweigh 
the problems caused by a small number of deadlocks. 

[0054] The Y-lock locking mechanism is discussed below. In one embodiment, three 
types of table-level locks are allowed on a join view (JV): Y locks, S locks, and X locks. 
The conditions under which such table-level locks are allowed are shown by the logic of 
Fig. 5. The logic shown in Fig. 5 is performed by a data server module 12 in the database 
system 10 of Fig. 1. If more than one data server module 12 is involved in a particular 
transaction, then each data server module 12 performs the acts of Fig. 5 concurrently. 

[0055] Upon receiving steps associated with a transaction, the data server module 12 first 
determines (at 202) if the transaction specifies both a read and write of the join view JV. 
As used here, writing a view is also referred to as updating or modifying the view. If the 
transaction specifies both a read and write of JV, then the data server module 12 
determines (at 204) if an S lock, X lock, or Y lock is currently active on the join view JV. 
If none of an S lock, X lock, or Y lock is active on the join view JV, the data server 
module 12 places (at 206) an X lock on the join view JV. Note that the X lock required 
for the received transaction conflicts with any of an X lock, S lock, or Y lock, so that the 
presence of any such table-level lock on JV blocks the required X lock for the received 
transaction. 
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[0056] If the data server module 12 determines (at 204) that any of the S lock, X lock, or 
Y lock is active on JV, the data server module 12 performs (at 208) conflicting lock 
processing. For example, the data server module 12 can wait until the current conflicting 
lock has been released. Different embodiments perform different tasks in response to 
detecting that a received transaction cannot proceed because of conflicting locks placed 
on the join view JV. 

[0057] If the transaction is determined not to involve both a read and write of the join 
view JV (at 202), the data server module 12 checks (at 210) if the transaction involves a 
read (but not a write) of the join view. If so, the data server module 12 checks (at 212) if 
there is either a Y lock or an X lock on JV. If neither a Y lock nor X lock is currently 
active on JV, the data server module 12 places (at 214) an S lock on the join view JV . 
Note that the required S lock conflicts with either a Y lock or X lock, so that the presence 
of either the Y lock or X lock blocks acquisition of the S lock for the received transaction. 
If either a Y lock or X lock is presently active on JV, the data server module 12 processes 
(at 208) the conflicting lock condition. 

[0058] If the transaction does not involve a read of the join view JV, the data server 
module checks (at 216) if the transaction involves a write (but not a read) of the join view 
JV. If so, the data server module 12 checks (at 218) if there is either an S lock or an X 
lock on JV. If not, then the data server module 12 places (at 220) a Y lock on the join 
view JV. However, if either an S lock or an X lock is presently active on JV, the data 
server module 12 processes (at 208) the conflicting lock condition. 

[0059] The discussion above refers to placing a table-level Y lock on a join view. The Y- 
lock locking mechanism for join views is extendable to also allow value locks on join 
views (tuple-level lock). Consider a join view JV defined on base relations Ri, R2, . . ., 
and R„. For a fixed i (1 < i < n), suppose that R.ai is an attribute of base relation R that 
also appears in JV. Then X and S value locking on R.aj for JV is allowed. For example, 
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consider a transaction T that only updates base relation Ri. If the update to base relation 
Ri specifies the value(s) of Ri.ai, then transaction T can also place an IY lock on JV and 
one or several X value locks (not Y value locks) on Ri.ai for JV. If transaction T reads JV 
by specifying the Ri.a* value(s), then transaction T can put an IS lock on JV and one or 
several S value locks on R^a* for JV. If transaction T both reads and updates JV by 
specifying the R^a* value(s), then transaction T can put an DC lock on JV and one or 
several S value locks and X value locks on Rj.ai for JV. 

[0060] The IY lock is similar to the traditional DC lock except that it is compatible with a 
Y lock or another IY lock. As with the DC lock, the table-level IY lock is placed on the 
join view JV in conjunction with an X value lock of JV to indicate to subsequent 
transactions that table-level X or S locks on JV are blocked (however, a table-level Y or 
IY lock on JV is still possible in the presence of the IY lock with X value lock). Also, a 
subsequent IS lock with an S value lock on JV is allowed in the presence of an IY lock 
with X value lock on JV. Note, however, that an X value lock on JV is incompatible with 
an S value lock placed on the same attribute value of JV. 

[0061] For a transaction that performs both a (table-level) read and (value) write of the 
join view JV, both an S lock and X value lock are needed on the join view JV. In this 
case, a table-level SIY lock (which is equivalent to an S lock and an IY lock) is placed on 
JV. The SIY lock is similar to the traditional SIX lock. One can think that DC = IS + IY. 
An SIX lock is equivalent to an S lock and an DC lock (for an X value lock). The SIY 
lock is only compatible with the IS lock. 

[0062] Note that SIX = S + DC = S + (IS + IY) = (S + IS) + IY = S + IY = SIY. Thus, the 
SIX lock is the same as the SIY lock. 

[0063] If transaction T both updates JV (without specifying the R*.aj value(s)), which is a 
table-write, and reads JV (specifying the Ri.ai value(s)), which is a value-read, then 
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transaction T requires both a Y lock and S value lock(s) on JV. In this case, a table-level 
YIS lock is played on JV (which is equivalent to a Y lock and an IS lock). The YIS lock 
(Y + IS) is similar to the SIX lock and is only compatible with the IY lock. 

[0064] The compatibilities of the different locks are listed in Table 1. 
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[0065] According to Table 1, a Y lock on JV is compatible with another Y lock or an IY 
lock on JV. However, the Y lock is incompatible with a table-level S lock, X lock, IS 
lock, DC lock, SIY lock, or YIS lock. Note that a table-level X lock is incompatible with 
any lock. An IY lock on JV is compatible with a table-level Y lock, IS lock, IY lock, DC 
lock, or YIS lock. However, an IY lock is incompatible with a table-level S lock, X lock, 
and SIY lock. An DC lock is compatible with an IS, IY, or DC lock, but not with any other 
locks. An SIY lock (S + IY lock) is compatible with an IS lock, but not with any other 
locks. A YIS lock (Y + IS lock) is compatible with an IY lock, but not with any other 
locks. Using the Y-lock locking mechanism, deadlocks associated with locks placed on 
join views are reduced. 

[0066] As discussed above, the deadlock probability associated with submitting 
concurrent transactions on multiple base relations of a join view can be high. The 
following provides a discussion of how the deadlock probability can be computed. For 
example, suppose the following conditions are present: there are k > 1 concurrent 
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transactions; each transaction contains n modification operations and modifies either A or 
B with probability p and 1-p, respectively; within a transaction, each modification 
operation modifies a random tuple in A (B) and each of the n tuples to be modified has a 
distinct (and random) A.c (B.d) value; there are in total s distinct values for A.c (B.d); 
and s » kn. 

[0067] Then, the probability that any particular transaction deadlocks is approximately 
p(l-p)(k-l)n 2 /(2s). If the condition s » kn is not true, then this deadlock probability is 
essentially 1. Hence, whether s » kn or not, a unified formula min(l, p(l-p)(k- 
l)n 2 /(2s)) can be used to roughly estimate the probability that any particular transaction 
deadlocks. 

[0068] This probability can be derived as follows. Consider a particular transaction T of 
the k transactions. There are two cases: 

Case 1: Transaction T modifies base relation A. From transaction T's. perspective, 
there are k-1 other transactions, where a 1-p fraction of them modify base relation 
B. Each of these (k-l)(l-p) transactions holds approximately n/2 sets of locks of 
the form L 2 \ and L 22 . Hence, these (k-l)(l-p) transactions hold (k-l)(l-p)n/2 sets 
of locks. For any modification operation MOi of transaction T, the probability that 
it deadlocks with some modification operation MO2 of another transaction T' is 
PWi=(k-l)(l-p)n/(2s)x(l/2)=(k-l)(l-p)n/(4s). This is because: 

(a) the probability that the tuples modified by MOi and M0 2 have the same 
value for A.c (B.d) is approximately (k-l)(l-p)n/(2s); and 

(b) in the case that the tuples modified by MOi and M0 2 have the same value 
for A.c (B.d), the probability that MOi and M0 2 deadlock is Vi (depending 
on whether or not step 2 occurs before step 3). 
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Transaction T contains n modification operations. Therefore, the probability that 
transaction T deadlocks is PWi(T)=l-(l-PW0WPWi==(k-l)(l-p)n 2 /(4s). 

Case 2: Transaction T modifies base relations B. In this case, following a 
reasoning that is similar to Case 1, it can be shown that the probability that 
transaction T deadlocks is PW 2 (T)«(k-l)pn 2 /(4s). 

[0069] Case 1 happens with probability p. Case 2 happens with probability 1-p. Hence, 
for any particular transaction T, the probability that transaction T deadlocks is 
PW(T)^xPWi(T)-Kl-p)xPW 2 (T)==p(l-p)(k-l)n 2 /(2s). For reasonable values of k, n, and 
s, this deadlock probability is relatively high. 

[0070] As an example, if p=50%, k=8, n=32, and s=10,000, this deadlock probability is 
approximately 9%. Doubling n to 64 raises this probability to 36%. For a larger n, the 
deadlock probability could easily get close to 1. 

[0071] However, by employing the partitioning and re-ordering techniques and Y-lock 
locking mechanism discussed above, the likelihood of deadlocks is reduced in a database 
system. Although all three techniques (partitioning technique, re-ordering technique, and 
Y-lock locking mechanism) have been discussed together, it is noted that less than all of 
such techniques can be applied and still achieve reduction of deadlocks. 

[0072] Instructions of the various software routines or modules discussed herein (such as 
the load utilities 20 and 21, parsing engine, and access modules, and so forth) are 
executed on corresponding control modules. The control modules include 
microprocessors, microcontrollers, processor modules or subsystems (including one or 
more microprocessors or microcontrollers), or other control or computing devices. As 
used here, a "controller" refers to hardware, software, or a combination thereof. A 
"controller" can refer to a single component or to plural components (whether software or 
hardware). 
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[0073] Data and instructions (of the various software routines or modules) are stored on 
one or more machine-readable storage media. The storage media include different forms 
of memory including semiconductor memory devices such as dynamic or static random 
access memories (DRAMs or SRAMs), erasable and programmable read-only memories 
(EPROMs), electrically erasable and programmable read-only memories (EEPROMs) and 
flash memories; magnetic disks such as fixed, floppy and removable disks; other 
magnetic media including tape; and optical media such as compact disks (CDs) or digital 
video disks (DVDs). 

[0074] The instructions of the software routines or modules are loaded or transported to a 
system in one of many different ways. For example, code segments including instructions 
stored on floppy disks, CD or DVD media, a hard disk, or transported through a network 
interface card, modem, or other interface device are loaded into the system and executed 
as corresponding software modules or layers. In the loading or transport process, data 
signals that are embodied in carrier waves (transmitted over telephone lines, network 
lines, wireless links, cables, and the like) communicate the code segments, including 
instructions, to the system. Such carrier waves are in the form of electrical, optical, 
acoustical, electromagnetic, or other types of signals. 

[0075] While the present invention has been described with respect to a limited number 
of embodiments, those skilled in the art, having the benefit of this disclosure, will 
appreciate numerous modifications and variations therefrom. It is intended that the 
appended claims cover all such modifications and variations as fall within the true spirit 
and scope of this present invention. 
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